Thread Migration to Improve Synchronization Performance

Authors

  • Srinivas Sridharan
  • Brett Keck
  • Richard Murphy
  • Surendar Chandra
  • Peter Kogge
Abstract

A number of prior research efforts have investigated thread scheduling mechanisms to enable better reuse of data in a processor’s cache. We propose to exploit the locality of the critical section data by enforcing an affinity between locks and the processor that has cached the execution state of the critical section protected by that lock. We investigate the idea of migrating threads to the “lock hot” processor, enabling the threads to reuse the critical section data from the processor’s cache and release the lock faster for other threads. We argue that this mechanism should improve the scalability of performance for highly multithreaded scientific applications. We test our hypothesis on a 4-way Itanium2 SMP running the 2.6.9 Linux kernel. We modified the Linux 2.6 kernel’s O(1) scheduler using information from the Futex (Fast User-space muTEX) mechanism in order to implement our policy. Using synthetic micro-benchmarks, we show a 10-90% performance improvement in CPU cycles, L2 miss ratios, and bus requests for applications that operate on significant amounts of data inside the critical section protected by locks. We also evaluate our policy for the SPLASH2 application suite.
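The intuition behind the "lock hot" processor can be illustrated with a toy model. The sketch below is not the authors' kernel implementation and does not reproduce their measurements; it only counts how often the critical-section data would have to move between caches under two policies. The names `run`, `Lock`, `hot_cpu`, and `CS_LINES` are hypothetical, and the model assumes that each thread returns to its home CPU after leaving the critical section and that a migration itself costs nothing in cache traffic.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

CS_LINES = 64  # assumed number of cache lines touched inside the critical section


@dataclass
class Lock:
    # CPU whose cache currently holds the critical-section data (None = nobody yet)
    hot_cpu: Optional[int] = None


def run(acquirers: List[int], home_cpu: Dict[int, int], migrate: bool) -> Tuple[int, int]:
    """Return (cache_line_transfers, migrations) for a sequence of lock acquisitions.

    acquirers: thread ids in the order they acquire the lock
    home_cpu:  thread id -> CPU the thread normally runs on
    migrate:   if True, a thread moves to the lock-hot CPU before entering
    """
    lock = Lock()
    transfers = 0
    migrations = 0
    for tid in acquirers:
        cpu = home_cpu[tid]
        if migrate and lock.hot_cpu is not None and cpu != lock.hot_cpu:
            cpu = lock.hot_cpu  # run the critical section where its data is cached
            migrations += 1
        if cpu != lock.hot_cpu:
            transfers += CS_LINES  # data must be fetched into this CPU's cache
        lock.hot_cpu = cpu
    return transfers, migrations


if __name__ == "__main__":
    home = {t: t for t in range(4)}      # four threads, one per CPU
    order = [0, 1, 2, 3, 0, 1, 2, 3]     # round-robin contention on one lock
    print("no migration:", run(order, home, migrate=False))
    print("migration:   ", run(order, home, migrate=True))
```

In this model the benefit grows with `CS_LINES`, which matches the abstract's claim that the policy helps applications operating on significant amounts of data inside the critical section; a real system would also have to weigh the cost of moving the thread's own working set.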

Similar resources

Ecole Normale Supérieure De Lyon

This paper studies the use of threads to support the execution of data parallel programs. The overhead induced by the multithreaded environment is experimentally studied: global synchronization, thread creation, communication, thread migration. We propose some simple criteria to determine the right size of threads with respect to the expected overhead. We use the PM multithreaded environment which pro...

Compiler Optimization of Value Communication for Thread-Level Speculation

In the context of Thread-Level Speculation (TLS), inter-thread value communication is the key to efficient parallel execution. From the compiler’s perspective, TLS supports two forms of inter-thread value communication: speculation and synchronization. Speculation allows for maximum parallel overlap when it succeeds, but becomes costly when it fails. Synchronization, on the other hand, introduc...

Comprehensive synchronization elimination for Java

Jonathan Aldrich, Emin Gün Sirer, Craig Chambers, and Susan J. Eggers Department of Computer Science and Engineering University of Washington Box 352350, Seattle WA 98195-2350 {jonal,egs,chambers,eggers}@cs.washington.edu Abstract In this paper, we describe three novel analyses for eliminating unnecessary synchronization that remove over 70% of dynamic synchronization operations on the majority...

Distributed Clustering and Scheduling of Object-Oriented Virtual Machines

This report presents an overview of several approaches to provide a Single System Image view of a cluster, particularly concerning the view of a single address space. The main focus of our work is to understand the current approaches for clustering a regular multithreaded and non-cluster-aware Java application, as well as the current techniques and metrics for scheduling threads in a heterogene...

Active Threads: an Extensible and Portable Light-Weight Thread System

This document describes a portable light-weight thread runtime system for uni- and multiprocessors targeted at irregular applications. Unlike most other thread packages, which utilize hard-coded scheduling policies, Active Threads provides a general mechanism for building data structure specific thread schedulers and for composing multiple scheduling policies within a single application. This all...

Publication date: 2006